 geographic information


Trust in foundation models and GenAI: A geographic perspective

McKenzie, Grant, Janowicz, Krzysztof, Kessler, Carsten

arXiv.org Artificial Intelligence

Large-scale pre-trained machine learning models have reshaped our understanding of artificial intelligence across numerous domains, including our own field of geography. As with any new technology, trust has taken on an important role in this discussion. In this chapter, we examine the multifaceted concept of trust in foundation models, particularly within a geographic context. As these models are increasingly relied upon for critical decision-making, trust, while essential, has become a fractured concept. Here we categorize trust into three types: epistemic trust in the training data, operational trust in the model's functionality, and interpersonal trust in the model developers. Each type of trust brings with it unique implications for geographic applications. Topics such as cultural context, data heterogeneity, and spatial relationships are fundamental to the spatial sciences and play an important role in developing trust. The chapter continues with a discussion of the challenges posed by different forms of biases, the importance of transparency and explainability, and ethical responsibilities in model development. Finally, the novel perspective of geographic information scientists is emphasized with a call for further transparency, bias mitigation, and regionally-informed policies. Simply put, this chapter aims to provide a conceptual starting point for researchers, practitioners, and policy-makers to better understand trust in (generative) GeoAI.


OneLoc: Geo-Aware Generative Recommender Systems for Local Life Service

Wei, Zhipeng, Cai, Kuo, She, Junda, Chen, Jie, Chen, Minghao, Zeng, Yang, Luo, Qiang, Zeng, Wencong, Tang, Ruiming, Gai, Kun, Zhou, Guorui

arXiv.org Artificial Intelligence

Local life service is a vital scenario in the Kuaishou App, where video recommendation is intrinsically linked with stores' location information. Recommendation in this scenario is therefore challenging because it must account for a user's interests and real-time location at the same time. For such complex scenarios, end-to-end generative recommendation has emerged as a new paradigm, with examples such as OneRec in the short video scenario, OneSug in the search scenario, and EGA in the advertising scenario. However, no end-to-end generative recommendation model has yet been developed for local life service, because several key challenges remain. The first is how to make full use of geographic information. The second is how to balance multiple objectives, including user interests, the distance between users and stores, and other business objectives. To address these challenges, we propose OneLoc. Specifically, we leverage geographic information from different perspectives: (1) geo-aware semantic ID incorporates both video and geographic information for tokenization, (2) geo-aware self-attention in the encoder leverages both video location similarity and the user's real-time location, and (3) neighbor-aware prompt captures rich context information surrounding users for generation. To balance multiple objectives, we use reinforcement learning and propose two reward functions, i.e., geographic reward and GMV reward. With the above design, OneLoc achieves outstanding offline and online performance. OneLoc has been deployed in the local life service of the Kuaishou App, where it serves 400 million active users daily and achieves improvements of 21.016% in gross merchandise value (GMV) and 17.891% in order numbers.


GeoRDF2Vec Learning Location-Aware Entity Representations in Knowledge Graphs

Boeckling, Martin, Paulheim, Heiko, Detzler, Sarah

arXiv.org Artificial Intelligence

Many knowledge graphs contain a substantial number of spatial entities, such as cities, buildings, and natural landmarks. For many of these entities, exact geometries are stored within the knowledge graphs. However, most existing approaches for learning entity representations do not take these geometries into account. In this paper, we introduce a variant of RDF2Vec that incorporates geometric information to learn location-aware embeddings of entities. Our approach expands different nodes by flooding the graph from geographic nodes, ensuring that each reachable node is considered. Based on the resulting flooded graph, we apply a modified version of RDF2Vec that biases graph walks using spatial weights. Through evaluations on multiple benchmark datasets, we demonstrate that our approach outperforms both non-location-aware RDF2Vec and GeoTransE.
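
The idea of biasing graph walks with spatial weights can be illustrated with a small sketch. The abstract does not specify the weighting scheme, so everything below is an assumption made for illustration: the inverse-distance weight, the `scale_km` parameter, the function names, and the toy graph are hypothetical and are not taken from GeoRDF2Vec. The sketch also ignores edge predicates, which RDF2Vec walks normally include, and assumes every node already has coordinates (for example after the flooding step).

```python
import math
import random

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def spatial_weight(point, anchor, scale_km=100.0):
    """Hypothetical weight in (0, 1] that decays with distance to the anchor."""
    return 1.0 / (1.0 + haversine_km(point, anchor) / scale_km)

def biased_walk(graph, coords, start, length, rng):
    """Random walk whose next hop is sampled in proportion to spatial weights.

    Weights are measured relative to the start node, so the walk prefers
    neighbors that are geographically close to where it began.
    """
    walk = [start]
    current = start
    for _ in range(length):
        neighbors = graph.get(current, [])
        if not neighbors:
            break  # dead end: stop the walk early
        weights = [spatial_weight(coords[n], coords[start]) for n in neighbors]
        current = rng.choices(neighbors, weights=weights, k=1)[0]
        walk.append(current)
    return walk

# Toy data (illustrative only): adjacency lists and approximate coordinates.
graph = {
    "Mannheim": ["Heidelberg", "Sydney"],
    "Heidelberg": ["Mannheim"],
    "Sydney": ["Mannheim"],
}
coords = {
    "Mannheim": (49.49, 8.47),
    "Heidelberg": (49.40, 8.69),
    "Sydney": (-33.87, 151.21),
}

if __name__ == "__main__":
    print(biased_walk(graph, coords, "Mannheim", 4, random.Random(0)))
```

In an RDF2Vec-style pipeline, walks like these would be collected for every entity and passed to a word2vec-style model as sentences; with the weighting above, nearby entities co-occur more often in the walks.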


Geo-Semantic-Parsing: AI-powered geoparsing by traversing semantic knowledge graphs

Nizzoli, Leonardo, Avvenuti, Marco, Tesconi, Maurizio, Cresci, Stefano

arXiv.org Artificial Intelligence

Online Social Networks (OSN) are privileged observation channels for understanding the geospatial facets of many real-world phenomena [1]. Unfortunately, in most cases OSN content lacks explicit and structured geographic information, as in the case of Twitter, where only a minimal fraction (1% to 4%) of messages are natively geotagged [2]. This shortage of explicit geographic information drastically limits the exploitation of OSN data in geospatial Decision Support Systems (DSS) [3]. Conversely, the prompt availability of geotagged content would empower existing systems and would open up the possibility to develop new and better geospatial services and applications [4, 5]. As a practical example of this kind, several social media-based systems have been proposed in recent years for mapping and visualizing situational information in the aftermath of mass disasters - a task dubbed crisis mapping - in an effort to augment emergency response [6, 7]. These systems, however, require geotagged data to be placed on crisis maps, which in turn requires performing the geoparsing task on the majority of social media content. Explicit geographic information is not only needed in early warning [8, 9] and emergency response systems [10, 11, 12, 13, 14], but also in systems and applications for improving event promotion [15, 16], touristic planning [17, 18, 19], healthcare accessibility [20], news aggregation [21]

Post-print of the article published in Decision Support Systems 136, 2020. Please refer to the published version: doi.org/10.1016/j.dss.2020.113346


VLMs as GeoGuessr Masters: Exceptional Performance, Hidden Biases, and Privacy Risks

Huang, Jingyuan, Huang, Jen-tse, Liu, Ziyi, Liu, Xiaoyuan, Wang, Wenxuan, Zhao, Jieyu

arXiv.org Artificial Intelligence

Visual-Language Models (VLMs) have shown remarkable performance across various tasks, particularly in recognizing geographic information from images. However, significant challenges remain, including biases and privacy concerns. To systematically address these issues in the context of geographic information recognition, we introduce a benchmark dataset consisting of 1,200 images paired with detailed geographic metadata. Evaluating four VLMs, we find that while these models demonstrate the ability to recognize geographic information from images, achieving up to $53.8\%$ accuracy in city prediction, they exhibit significant regional biases. Specifically, performance is substantially higher for economically developed and densely populated regions compared to less developed ($-12.5\%$) and sparsely populated ($-17.0\%$) areas. Moreover, the models exhibit regional biases, frequently overpredicting certain locations; for instance, they consistently predict Sydney for images taken in Australia. The strong performance of VLMs also raises privacy concerns, particularly for users who share images online without the intent of being identified. Our code and dataset are publicly available at https://github.com/uscnlp-lime/FairLocator.
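
The regional comparison described above amounts to computing accuracy per group and taking the difference. The sketch below shows one way to do that; it is not the benchmark's evaluation code. The record layout, the function names, and the toy records are assumptions made for illustration, and the toy numbers do not reproduce the figures reported in the abstract.

```python
from collections import Counter, defaultdict

def accuracy_by_group(records):
    """Return {group: accuracy} for (group, predicted_city, true_city) records."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for group, predicted, actual in records:
        totals[group] += 1
        hits[group] += int(predicted == actual)
    return {group: hits[group] / totals[group] for group in totals}

def accuracy_gap(accuracy, reference, other):
    """Accuracy of `other` minus accuracy of `reference` (negative = worse)."""
    return accuracy[other] - accuracy[reference]

def overpredicted(records):
    """Count how often each city appears as a wrong prediction."""
    return Counter(pred for _, pred, actual in records if pred != actual)

# Toy records (invented for illustration): (group, predicted city, true city).
records = [
    ("A", "Tokyo", "Tokyo"),
    ("A", "Paris", "Paris"),
    ("A", "Sydney", "Melbourne"),
    ("A", "Berlin", "Berlin"),
    ("B", "Nairobi", "Nairobi"),
    ("B", "Lagos", "Accra"),
    ("B", "Lima", "La Paz"),
    ("B", "Dhaka", "Khulna"),
]

if __name__ == "__main__":
    acc = accuracy_by_group(records)
    print(acc, accuracy_gap(acc, "A", "B"), overpredicted(records).most_common(1))
```

Counting wrong predictions per city, as `overpredicted` does, is a simple way to surface the kind of over-prediction the abstract mentions for Sydney.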


Towards better social crisis data with HERMES: Hybrid sensing for EmeRgency ManagEment System

Avvenuti, Marco, Bellomo, Salvatore, Cresci, Stefano, Nizzoli, Leonardo, Tesconi, Maurizio

arXiv.org Artificial Intelligence

People involved in mass emergencies increasingly publish information-rich content in online social networks (OSNs), thus acting as a distributed and resilient network of human sensors. In this work we present HERMES, a system designed to enrich the information spontaneously disclosed by OSN users in the aftermath of disasters. HERMES leverages a mixed data collection strategy, called hybrid sensing, and state-of-the-art AI techniques. Evaluated in real-world emergencies, HERMES proved to increase: (i) the amount of available damage information; (ii) the density (up to 7x) and the variety (up to 18x) of the retrieved geographic information; (iii) the geographic coverage (up to 30%) and granularity.


StreetviewLLM: Extracting Geographic Information Using a Chain-of-Thought Multimodal Large Language Model

Li, Zongrong, Xu, Junhao, Wang, Siqin, Wu, Yifan, Li, Haiyang

arXiv.org Artificial Intelligence

Traditional machine learning has played a key role in geospatial predictions, but its limitations have become more distinct over time. One significant drawback of traditional ML models is that they often rely on structured geospatial data, such as raster or vector formats, which limits their ability to handle unstructured or multimodal data (Pierdicca & Paolanti, 2022). Additionally, traditional models may struggle to capture complex spatial patterns and regional variations, leading to problems with data sparsity and uneven distribution, which could affect the accuracy and generalizability of predictions (Nikparvar & Thill, 2021). In contrast, large language models (LLMs) have shown great promise across various fields by processing vast amounts of data and reasoning across multiple modalities (Chang et al., 2024). By integrating textual, visual, and contextual information, LLMs can introduce novel covariates for geospatial predictions, thus enhancing traditional approaches. However, extracting geospatial knowledge from LLMs poses its own challenges. Although using geographic coordinates (i.e., latitude and longitude) is a straightforward way to retrieve location-specific information, this approach often yields suboptimal results, particularly when dealing with complex spatial relationships and regional characteristics. As a result, the traditional model cannot easily harness the full potential of multi-modal data, hindering its effectiveness in applications demanding comprehensive, cross-modal insights.


Application of Disentanglement to Map Registration Problem

Song, Hae Jin, Krawczuk, Patrycja, Huang, Po-Hsuan

arXiv.org Artificial Intelligence

Geospatial data come from various sources, such as satellites, aircraft, and LiDAR. The variability of the source is not limited to the types of data acquisition techniques, as we have maps from different time periods. To incorporate these data for a coherent analysis, it is essential to first align different "styles" of geospatial data to their matching images that point to the same location on the surface of the Earth. In this paper, we approach image registration as a two-step process of (1) extracting geospatial contents invariant to visual (and any other non-content-related) information, and (2) matching the data based on such (purely) geospatial contents. We hypothesize that a combination of $\beta$-VAE-like architecture [2] and adversarial training will achieve both the disentanglement of the geographic information and artistic styles and generation of new map tiles by composing the encoded geographic information with any artistic style.
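
For reference, the standard $\beta$-VAE objective is given below. The abstract only says "$\beta$-VAE-like", so this is the usual textbook form and not necessarily the exact loss used in the paper; the adversarial term is not shown. Here $q_\phi(z \mid x)$ is the encoder, $p_\theta(x \mid z)$ the decoder, $p(z)$ the prior, and the objective is maximized. Setting $\beta = 1$ recovers the ordinary VAE, while $\beta > 1$ puts more pressure on the latent code, which is commonly associated with more disentangled representations.

```latex
\mathcal{L}(\theta, \phi; x) =
  \mathbb{E}_{q_\phi(z \mid x)}\left[\log p_\theta(x \mid z)\right]
  - \beta \, D_{\mathrm{KL}}\left(q_\phi(z \mid x) \,\|\, p(z)\right)
```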


Fine-Grained Urban Flow Inference with Multi-scale Representation Learning

Yuan, Shilu, Li, Dongfeng, Liu, Wei, Zhang, Xinxin, Chen, Meng, Zhang, Junjie, Gong, Yongshun

arXiv.org Artificial Intelligence

Fine-grained urban flow inference (FUFI) is a crucial transportation service aimed at improving traffic efficiency and safety. FUFI can infer fine-grained urban traffic flows based solely on observed coarse-grained data. However, most existing methods focus on the influence of single-scale static geographic information on FUFI, neglecting the interactions and dynamic information between different-scale regions within the city. Different-scale geographical features can capture redundant information from the same spatial areas. In order to effectively learn multi-scale information across time and space, we propose an effective fine-grained urban flow inference model called UrbanMSR, which uses self-supervised contrastive learning to obtain dynamic multi-scale representations of neighborhood-level and city-level geographic information, and fuses multi-scale representations to improve fine-grained accuracy. We validate the performance through extensive experiments on three real-world datasets. Comparisons with state-of-the-art methods demonstrate the superiority of the proposed model.
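
The abstract does not say which contrastive loss UrbanMSR uses. As an illustration only, a common choice for self-supervised contrastive learning is the InfoNCE loss shown below, which is an assumption here and not the paper's stated objective. In this form, $z_i$ is the representation of an anchor region, $z_i^{+}$ is its positive view, the sum in the denominator runs over the positive and the negatives in a batch of size $N$, $\operatorname{sim}$ is a similarity function such as cosine similarity, and $\tau$ is a temperature.

```latex
\ell_i = -\log
  \frac{\exp\left(\operatorname{sim}(z_i, z_i^{+}) / \tau\right)}
       {\sum_{j=1}^{N} \exp\left(\operatorname{sim}(z_i, z_j) / \tau\right)}
```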


Geospatial Knowledge Graphs

Zhu, Rui

arXiv.org Artificial Intelligence

Geospatial knowledge graphs have emerged as a novel paradigm for representing and reasoning over geospatial information. In this framework, entities such as places, people, events, and observations are depicted as nodes, while their relationships are represented as edges. This graph-based data format lays the foundation for creating a "FAIR" (Findable, Accessible, Interoperable, and Reusable) environment, facilitating the management and analysis of geographic information. This entry first introduces key concepts in knowledge graphs along with their associated standardization and tools. It then delves into the application of knowledge graphs in geography and environmental sciences, emphasizing their role in bridging symbolic and subsymbolic GeoAI to address cross-disciplinary geospatial challenges. At the end, new research directions related to geospatial knowledge graphs are outlined.